ABSTRACT

We apply several statistical estimators to high-resolution N-body simulations of two
currently viable cosmological models: a mixed dark matter model, having $\Omega_\nu = 0.2$ contributed
by two massive neutrinos (C+2νDM), and a Cold Dark Matter model with Cosmological
Constant (ΛCDM) with $\Omega_0 = 0.3$ and $h = 0.7$. Our aim is to compare simulated galaxy samples
with the Perseus-Pisces redshift survey (PPS). We consider the n-point correlation functions
(n = 2-4), the N-count probability functions $P_N$, including the void probability function $P_0$,
and the underdensity probability function $U_\epsilon$ (where $\epsilon$ fixes the underdensity threshold as a
percentage of the average). We find that $P_0$ (for which PPS and CfA2 data agree) and $P_1$
distinguish efficiently between the models, while $U_\epsilon$ is only marginally discriminatory. By
contrast, the reduced skewness and kurtosis are, respectively, $S_3 \simeq 2.2$ and $S_4 \simeq 6$-$7$ in all cases,
quite independent of the scale, in agreement with hierarchical scaling predictions and estimates
based on redshift surveys. Among our results, we emphasize the remarkable agreement between
PPS data and C+2νDM in all the tests performed. In contrast, the above ΛCDM model has
serious difficulties in reproducing observational data if galaxies and matter overdensities are
related in a simple way.

Subject headings: cosmology: theory - dark matter - galaxies: clustering - large-scale structure 
of the Universe 

1. Introduction 

Although many cosmological models have been considered by various authors, much effort has been 
concentrated on models inspired by the hypotheses of inflation. In such a context, the "natural" choice is 
to consider COBE normalized models with negligible spatial curvature, Gaussian and adiabatic primordial
fluctuations and a spectrum close to the Zel'dovich one. 

Among them, models that agree with the available data on large scales, where fluctuations are still in 
the linear regime, deserve further inspection at smaller scales. Critical differences can be expected to exist 
for fluctuations still in a weakly non-linear regime. Here we need statistical tests, which are simultaneously 
robust and discriminatory, to compare real data with N-body simulations. On the contrary, when much 
smaller scales are inspected, it is not clear whether some residual signal coming from the shape of the 
post-recombination spectrum can still be appreciated, and in any case it will be necessary to include 
astrophysics that is still poorly understood, such as star formation and feedback effects. 

In a previous paper (Ghigna et al. 1994, hereafter paper G), we showed that the void probability
function (VPF), $P_0$, can be a robust and discriminatory test on the distribution of matter in underdense
regions of the Universe, which are however in a weakly non-linear regime ($|\delta\rho| \lesssim \rho$). In paper G, voids in a
volume-limited sample of the Perseus-Pisces Survey (PPS, Giovanelli & Haynes 1991) were analysed and
the function $P_0(r)$ was computed. This VPF was then compared with simulations based on the $\Omega_0 = 1$
CDM model and on the CHDM model with $\Omega_\nu = 0.3$ and one massive neutrino ($m_\nu \simeq 7$ eV). We found
that this CHDM model produces too many intermediate-size voids.

In this paper we extend the comparison between PPS data and simulations by considering new
models and new tests. We study a ΛCDM model with $\Omega_0 = 0.3$ and $h = 0.7$ (hereafter this model will
be called ΛCDM0.3), and a mixed dark matter model with $\Omega_\nu = 0.2$, $h = 0.5$, and 2 massive neutrinos,
each having a mass of $m_\nu = 2.3$ eV (hereafter this mix will be called C+2νDM). The ΛCDM0.3 model has
been found to reproduce the power-spectrum shape on intermediate ($\sim 20\,h^{-1}$ Mpc) scales (e.g., Peacock
& Dodds 1994; Borgani et al. 1996), as well as the abundance of galaxy clusters (e.g., Eke et al. 1996,
and references therein). Klypin, Primack, & Holtzman (1996) showed however that it predicts galaxy
clustering that is too strong on scales $\lesssim 10\,h^{-1}$ Mpc. As for the C+2νDM model, it was first considered by
Primack et al. (1995). Sharing $\Omega_\nu$ between two massive $\nu$ species decreases the fluctuation amplitude on
$\sim 10\,h^{-1}$ Mpc scales with respect to the standard CHDM model (thus alleviating cluster overproduction),
without reducing the small-scale ($\sim 1\,h^{-1}$ Mpc) power to an unacceptable level for early galaxy formation.

Among previous works on the VPF, it should be mentioned that Fry et al. (1989) estimated it
for a preliminary version of PPS and compared the results with CDM N-body simulations. Weinberg
& Cole (1992) showed that the VPF can also discriminate between Gaussian and non-Gaussian initial
conditions. Finally, Ghigna et al. (1996) compared the void statistics in the PPS sample analyzed here
to simulations of the Broken Scale Invariance (BSI) model (CDM with a characteristic scale in the
post-inflationary spectrum; see, e.g., Gottlöber, Mücket & Starobinski 1994), which was found to agree
with observations.

The n-point correlation functions for the PPS, evaluated through the counts-of-neighbours 
technique, were also considered in previous papers (Bonometto et al. 1993 and 1995; Ghigna et al. 1996), 
but their comparison with N-body simulations did not show a strong discriminatory power. Here we work 
them out through a different technique (counts-in-cells), which however confirms previous results.

In addition to the n-point correlation functions and the VPF, we address the N-count probability
functions $P_N(r)$ and the underdensity function $U_\epsilon(r)$, defined as the probability that a randomly placed
sphere has a galaxy density below $\epsilon$% of the average. The functions $P_N(r)$ ($N > 1$) and $U_\epsilon(r)$ were never
estimated before in samples with depth comparable with PPS.

PPS results are given here in the CMB reference frame (at variance with paper G, where the
volume-limited subsample was worked out in the Local Group rest frame), and are therefore more suitable
for comparison with our simulations. As a matter of fact, however, there is hardly any difference between
the analyses realized in the two frames (filled circles for LG vs. triangles for CMB in Figure 1).

Void analyses were also performed on the CfA2 survey (Vogeley, Geller, & Huchra 1991; Vogeley et al. 
1994), the SSRS (Maurogordato, Schaeffer, & da Costa 1992; open circles in Figure 1) and the 1.2 Jy IRAS 
redshift survey (Bouchet et al. 1993). It is remarkable that these results are globally consistent with the 
PPS ones, despite the fact that they refer to samples defined in different ways and differently located in the 
sky. 

A crucial point, when testing cosmological models through the VPF statistics, concerns the identification
of galaxies. Indeed, a change in the efficiency of galaxy formation in underdense regions has an immediate
impact on $P_0$ (Betancort-Rijo 1990; Einasto et al. 1991; Little & Weinberg 1994). The last authors explored
three different criteria to identify galaxies in N-body simulations: (i) as peak-particles of the linear density
field (e.g., Davis et al. 1985), (ii) using the biasing relation derived by Cen & Ostriker (1993) from their 
CDM hydrodynamic simulations, (iii) as high-density regions in the evolved density field. However, some 
concerns have been raised about whether the linear biasing approach yields the seeds where non-linear 
structures later form (e.g., Katz, Quinn, & Gelb 1993). Furthermore, although physically motivated, the 
Cen & Ostriker results were from CDM simulations performed within a limited dynamical range. For these 
reasons, we decided to identify galaxies as corresponding to high peaks of the evolved density field. However, 
even within this choice, different criteria to fragment overmerged structures into individual objects can be 
proposed (e.g., Gelb & Bertschinger 1993). As a general criterion, galaxy identification should be required
to produce the basic observed properties of the galaxy distribution, i.e. their average separation, two-point 
correlation function and, possibly, the observed luminosity function. In the next Section, we will discuss the 
simple technique used here to identify galaxies and compare it with the approach adopted in paper G. 

Based on the simulation outputs, we generate artificial samples in redshift space having the same
geometry and number of galaxies as the volume-limited sample extracted from PPS. We extract several
samples from each simulation box corresponding to different viewpoints, so as to obtain an estimate of the 
sky variance within a given real-space volume. 

2. Real and simulated data sets. 

Real data. - The PPS database (Giovanelli & Haynes 1989 and 1991) is limited to the region bound by
$22^{\rm h} < \alpha < 3^{\rm h}10^{\rm m}$, $0^\circ < \delta < 42^\circ 30'$ to avoid areas of high galactic extinction. Zwicky magnitudes of all
galaxies brighter than $m_{\rm Zw} = 15.5$ are however corrected for extinction, by using the absorption maps
of Burstein & Heiles (1978). The resulting sample includes 3395 galaxies and is virtually 100% complete
for all morphological types up to $m_{\rm Zw} = 15.5$. Observed velocities are then corrected by subtracting the
component of our velocity relative to the CMB, therefore putting the observer at rest in the CMB frame.
A volume-limited subsample (VLS) is then extracted, whose limiting magnitude $M_{\rm lim} = -19 + 5 \log h$
corresponds to a limiting depth of $79\,h^{-1}$ Mpc. This sample contains 902 galaxies with mean galaxy
separation $d = 5.5\,h^{-1}$ Mpc. This sample differs from the one used in paper G, which was obtained by
setting the observer at rest with respect to the centroid of the Local Group. The presence of a large
volume, moving coherently with the Local Group, allowed us to include several middle-distance faint galaxies
in the old volume-limited sample. Hence the total number of galaxies it contained was 1032, and their
average separation was $5.2\,h^{-1}$ Mpc. The decrease in the number of galaxies, and the consequent slightly worse
statistics, is the price to be paid to have full coherence between observed and simulated data.

Simulated samples. - We used four different PM simulations obtained evolving $256^3$ cold particles on an $800^3$
cell grid. In more detail: (i) One realization of C+2νDM, with an additional $2 \times 256^3$ hot particles, in a
box with side $L = 50\,h^{-1}$ Mpc ($h = 0.5$), normalized to a COBE quadrupole $Q_{\rm rms-PS} = 17\,\mu$K and yielding
$\sigma_8 = 0.67$ (Primack et al. 1995). (ii)-(iii) Two realizations of ΛCDM0.3 (ΛCDM$_1$ and ΛCDM$_2$) in a box
with side $L = 50\,h^{-1}$ Mpc ($h = 0.7$), normalized to a COBE quadrupole $Q_{\rm rms-PS} = 21.6\,\mu$K and yielding
$\sigma_8 = 1.10$. The first one started from the same random numbers as C+2νDM. (iv) A further realization
of ΛCDM0.3 in a box with side $L = 80\,h^{-1}$ Mpc (ΛCDM-80), with the same normalization as above. (The
ΛCDM0.3 simulations are from Klypin, Primack & Holtzman 1996). All these models assume a primordial
spectral index n = 1.

Let us now discuss the criteria followed to identify galaxies. As in paper G, galaxies are set in
overdensities exceeding a given threshold, but here the simulation output was preliminarily treated in such
a way as to provide a direct identification of overdensity regions. In paper G, we had first found the number
$n_p$ of particles in each cell to single out local density maxima. However, single cells are below the resolution
allowed by PM codes. So the density of each cell was gauged by considering the sum $\sum_{p=1}^{27} n_p$
of the particles contained in a 3 x 3 x 3 cell box centered on each cell. Here, the simulation output, in
addition to listing coordinates and velocities for DM particles, also gives us directly the density contrast $\delta$
in a 27-cell volume centered on each of them (actually, we use a large random subsample of particles with
uniform probability, amounting to a 20% fraction of the total).
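
As an illustration of the 27-cell smoothing described above, the following minimal sketch evaluates the density contrast in a 3 x 3 x 3 block of grid cells. It is not the code used for the simulations: the function name, the nested-list grid layout, the periodic wrapping and the choice of centering the block on a cell (rather than on a particle) are all assumptions made for the example.

```python
def density_contrast_27(counts, i, j, k):
    """Density contrast in the 3x3x3 block of cells centred on cell (i, j, k).

    counts[i][j][k] holds the number of particles in each cell of a cubic
    grid; periodic boundaries are assumed (hypothetical layout).
    """
    n = len(counts)
    # Sum the particle counts over the 27 cells of the block.
    total = sum(
        counts[(i + di) % n][(j + dj) % n][(k + dk) % n]
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
        for dk in (-1, 0, 1)
    )
    # Mean number of particles per cell over the whole grid.
    mean_per_cell = sum(c for plane in counts for row in plane for c in row) / n**3
    return total / (27 * mean_per_cell) - 1.0


# Toy 4^3 grid: one particle per cell, plus 26 extra particles in one cell.
grid = [[[1] * 4 for _ in range(4)] for _ in range(4)]
grid[1][1][1] += 26
delta = density_contrast_27(grid, 1, 1, 1)
```

A particle would then be retained when the contrast of its block exceeds the chosen threshold.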

Therefore, we can simply select a priori a threshold density contrast $\delta_{\rm th}$ and consider only particles
with $\delta > \delta_{\rm th}$. We considered three values: $\delta_{\rm th} = 100$, 150, 400. They are large enough to ensure that peaks
above threshold correspond to virialized structures. The two lowest values are (more and less) conservative
estimates of the typical density contrast associated with structures becoming virialized at the present
epoch, while the highest value allows us to significantly perturb this basic distribution of objects.

The total numbers of particles above the $\delta_{\rm th}$ selected are still quite large, as expected (about 5-10% of
the total). Among them we select a small subset at random (751 particles for the box of side $L = 50\,h^{-1}$ Mpc
and 3077 for the box of side $L = 80\,h^{-1}$ Mpc), in order that the average inter-particle separation is
$d_{\rm gal} = 5.5\,h^{-1}$ Mpc, i.e. the average galaxy separation in the real volume-limited sample.
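
The particle numbers quoted above follow from requiring one object per cube of side equal to the mean separation. A short check, with a hypothetical helper name, reads:

```python
def n_galaxies(box_side, mean_separation):
    """Number of points that gives the requested mean separation in a cubic box.

    Both lengths are in the same units (here h^-1 Mpc).
    """
    return round((box_side / mean_separation) ** 3)


n50 = n_galaxies(50.0, 5.5)  # box of side 50 h^-1 Mpc
n80 = n_galaxies(80.0, 5.5)  # box of side 80 h^-1 Mpc
```

Rounding to the nearest integer reproduces the values 751 and 3077 given in the text.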

By construction, the surviving particles are located in regions whose overdensities are above the 
thresholds selected and the distributions of those particles inside the parent overdensities automatically fit 
the different density profiles of such regions within the "noise interval" introduced by the randomization 
process, which anyway should be expected to occur in the real world as well. 

In principle, passing from ~ 1/10 of DM particles down to ~ 1/1000 could introduce a bias, namely
when small overdensity regions are considered. The volume-limited sample extracted from PPS contains
galaxies with luminosity exceeding $L_* \simeq 10^{10}\,h^{-2} L_\odot$. Accordingly, overdensity regions whose mass is
$\sim 10^{12}\,h^{-1} M_\odot$, and which typically yield one or two galaxies, can be included in or excluded from the
artificial samples by chance. This point is potentially delicate, especially for measures like the VPF, whose output could
be affected by the inclusion or exclusion of a few isolated galaxies.

We paid close attention to this point, by building artificial samples from different random choices and
comparing the outputs of our statistical measures for them, although most results reported in this paper,
for the sake of homogeneity, come from a single realization. In the next Section we will discuss this point
further. We only anticipate here that the effect of changing the random subset of particles is always quite
modest, apart from a few cases whose anomaly is apparent. Moreover, the scatter induced by such an effect
is smaller than that associated with the change of the observer setting within a given realization. The results
reported were however checked to be typical, by comparing 10 different realizations.

As a further check of the robustness of the results based on the above galaxy identification method, we
implemented in the C+2νDM simulation two further prescriptions, both starting from the identification of
DM halos and, therefore, free of this possible source of bias. Such prescriptions were also meant to approach 
the procedure followed in paper G, in spite of the different characteristics of the simulations used here. The 
method of this countercheck is described in the next paragraph and the results are reported in the next 
Section (cf. Figure 7); they confirm the validity of our standard procedure. 

As a starting point to identify halos, we select local density maxima on the grid, whose overdensity 
is greater than 200. Afterwards, we center a sphere on this point, with radius equal to that at which the 
overdensity drops to 200. The center of mass of the cold particles falling within the sphere is then computed 
and used as the starting point for the next iteration. We always find that this procedure converges after
a few iterations. At the end, the mass of the halo is defined as the sum of the masses of all the member
DM particles. The resulting sample of DM halos is then used to identify galaxies. The two prescriptions 
correspond then to two extreme cases: 

(i) No fragmentation: $N_{\rm gal} = (L/d_{\rm gal})^3 = 750$ galaxies are identified as the $N_{\rm gal}$ most massive
halos. Each halo is then identified with a single galaxy. The resulting halo mass threshold is
$M_{\rm th} = 1.5 \times 10^{12}\,h^{-1} M_\odot$.

(ii) Fragmentation: in order to break up halos we follow the same simple prescription described by
Bonometto et al. (1995). After a halo mass threshold is chosen, the number of galaxies belonging
to the i-th halo of mass $M_i$ is assumed to be $N_i = [M_i/M_{\rm th}]$, where $[x]$ denotes the largest integer
that does not exceed $x$. Therefore, the resulting mass threshold, $M_{\rm th} = 2.4 \times 10^{12}\,h^{-1} M_\odot$, is fixed by
requiring that the total number of galaxies matches $N_{\rm gal}$. Fragments are assigned random positions
within the radius of the parent halo and velocities drawn from a Gaussian distribution having mean
equal to the halo peculiar velocity and dispersion equal to the rms velocity of the member cold
particles.
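
The fragmentation rule in (ii) can be sketched as follows. The text does not say how the threshold was searched for, so the bisection used here is only one possible choice, and the function names and toy halo masses are hypothetical.

```python
import math


def galaxies_per_halo(masses, m_th):
    """N_i = [M_i / M_th]: largest integer not exceeding the mass ratio."""
    return [math.floor(m / m_th) for m in masses]


def mass_threshold(masses, n_gal, n_iter=60):
    """Largest M_th for which the total number of galaxies is at least n_gal.

    The total is a non-increasing step function of M_th, so bisection works.
    An exact match to n_gal is not always attainable; the caller should check.
    """
    lo = min(masses) / (n_gal + 1)   # total(lo) >= n_gal by construction
    hi = 2.0 * max(masses)           # total(hi) == 0 < n_gal
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if sum(galaxies_per_halo(masses, mid)) >= n_gal:
            lo = mid
        else:
            hi = mid
    return lo


# Toy halo masses in arbitrary units, split into 4 galaxies in total.
halo_masses = [10.0, 6.0, 3.0, 1.0]
m_th = mass_threshold(halo_masses, 4)
n_per_halo = galaxies_per_halo(halo_masses, m_th)
```

In this toy case the most massive halo is split into three galaxies and the second into one, while the two lightest halos host none.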

As already outlined and discussed in paper G, these two prescriptions represent extreme cases within a 
class of fragmentation methods not relying on local anti-biasing. Therefore, although we do not attach to 
them any strong physical motivation, they can be reliably used for bracketing results based on more refined 
approaches. 

At variance with paper G, both the standard procedure used in the present work and the two latter
prescriptions make no recourse to the galaxy luminosity function. In the simplest way, this would require
one additional parameter, the mass-to-light ratio M/L of overdensity regions, which cannot be easily
related to the physical M/L of well-defined objects and generally would depend on the resolution of the
simulations (see also Ghigna et al. 1996). The outputs are however strictly analogous. In conclusion,
what we work out are galaxies, located in overdensity regions, with suitable individual velocities which are
essential to set them in redshift space. Overdensities were verified to be essentially in virial equilibrium.
Hence, the velocity distribution for each region above threshold is quite similar to the one considered
in paper G, where each cell above threshold was given a total galaxy mass proportional to $\sum_{p=1}^{27} n_p$ and
virial equilibrium was explicitly imposed to obtain individual galaxy velocities. As a final consideration, let
us notice that these procedures, as well as the one adopted in paper G, do not leave room for any form of
velocity bias. It is however important to notice that, also thanks to that, we were able to keep the number
of parameters fixing the distribution down to one.

Data-simulation comparison. - The comparison between real and simulated data is performed in redshift
space, by extracting from the periodic simulation box (with replication) a volume with the same geometry
and measures of the real PPS, with respect to a given observer setting. More details about this operation
are given in paper G (see also Ghigna et al. 1996). The main question is related to the fact that
the volume-limited sample has a depth of $79\,h^{-1}$ Mpc, while the smaller simulation boxes have a side
$L = 50\,h^{-1}$ Mpc. As already outlined in paper G, this is not a real problem, since our analysis concerns
scales $\lesssim 13\,h^{-1}$ Mpc.

For each simulation and each choice of $\delta_{\rm th}$ we considered several observer settings, by selecting at
random both the location of the observer and the direction of the axis of the volume observed. However,
for each random setting, we first verified that the galaxy density in the artificial PPS sample differed by
less than 2% from the expected one ($= 902/V_{\rm VLS}$). In this way 5 different observer settings were selected
for each case. As we shall see below, the scatter among observers, which is a measure of the sky-variance,
is always small and approximately of the same order as bootstrap errors.

3. Statistical analyses 

We estimate the statistical distribution of galaxies in each sample through the count-in-cell technique.
We work out the probabilities $P_N$ that a randomly placed cell contains N galaxies. From this we compute
the moments of counts and obtain the volume-averaged correlation functions $\bar{\xi}_n$, after subtracting shot-noise
contributions (see Bonometto et al. 1995, for more details). As in paper G we use spherical cells completely
contained in the sample boundaries whose radii R are in the range 1-13 $h^{-1}$ Mpc and, at each R, we take
$N_R = 2 V_{\rm VLS}/V_R$ spheres randomly distributed in the sample volume. Here $V_{\rm VLS} = 1.5 \times 10^5\,h^{-3}$ Mpc$^3$ is
the volume of the sample and $V_R = 4\pi R^3/3$. As suggested by Fry & Gaztanaga (1994), $N_R$ should give
a sensible estimate of the number of independent cells that can be allocated in the volume $V_{\rm VLS}$ in the
presence of clustering (therefore the factor 2). This argument works well at least at relatively large scales
for which $\bar{\xi}_2(R) \lesssim 1$. At smaller R, underestimating $N_R$ can in principle make the outcome of a measure
excessively dependent on the set of $N_R$ spheres chosen. We verified that this is not the case, by analyzing
the PPS sample for 20 different realizations of the positions of the spheres. The small shifts occurring at the
smallest radii are anyway accounted for by our estimates of errors, which we obtain through the bootstrap
resampling technique (e.g. Ling, Frenk & Barrow 1986). We consider up to 50 resamplings, even though
we find rapid convergence and a value of 20 would already provide satisfactory estimates. In the following
figures, for reasons of clarity, bootstrap errors will be reported only for the observational data, but they also
affect the results on simulated data, with similar magnitudes. (For a careful analysis of the uncertainties in
count-in-cell statistics see Colombi, Bouchet & Schaeffer 1994, 1995 and Szapudi & Colombi 1996.)
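
The bootstrap error estimate mentioned above can be illustrated with a generic sketch. It resamples a list of values with replacement and takes the standard deviation of the recomputed statistic; in the actual analysis the resampled objects would be the galaxies and the statistic a count-in-cell measure. The function name, the fixed seed and the toy data are assumptions made for the example.

```python
import random
import statistics


def bootstrap_error(sample, statistic, n_resamples=50, seed=0):
    """Standard deviation of `statistic` over bootstrap resamplings of `sample`."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    values = []
    for _ in range(n_resamples):
        # Draw len(sample) elements with replacement.
        resample = [rng.choice(sample) for _ in sample]
        values.append(statistic(resample))
    return statistics.stdev(values)


# Toy usage: 1-sigma bootstrap error on the mean of a small data set.
toy_data = [float(x) for x in range(100)]
sigma_mean = bootstrap_error(toy_data, statistics.mean)
```

The default of 50 resamplings mirrors the maximum number quoted in the text; the errorbars in the figures are a multiple of such a dispersion.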

For each galaxy sample, we worked out $\bar{\xi}_n(R)$ for n = 2,3,4, i.e. variance, skewness and kurtosis
respectively. As far as $\bar{\xi}_{3,4}$ are concerned, we will refer to the reduced cumulants $S_3 = \bar{\xi}_3/\bar{\xi}_2^2$ and $S_4 = \bar{\xi}_4/\bar{\xi}_2^3$.
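
For illustration, the variance and the reduced skewness can be obtained from a list of cell counts as sketched below. The shot-noise terms are the standard ones for Poisson sampling of a continuous field; since the text defers to Bonometto et al. (1995) for details, the exact estimator used there may differ, and the function name and toy counts are hypothetical.

```python
def reduced_cumulants(counts):
    """Return (xi2, xi3, S3) from the galaxy counts in a set of cells.

    Shot noise is removed assuming Poisson sampling:
      mu2 = Nbar + Nbar^2 xi2
      mu3 = Nbar + 3 Nbar^2 xi2 + Nbar^3 xi3
    where mu2 and mu3 are the central moments of the counts.
    """
    n_cells = len(counts)
    nbar = sum(counts) / n_cells
    mu2 = sum((c - nbar) ** 2 for c in counts) / n_cells
    mu3 = sum((c - nbar) ** 3 for c in counts) / n_cells
    xi2 = (mu2 - nbar) / nbar**2
    xi3 = (mu3 - 3.0 * mu2 + 2.0 * nbar) / nbar**3
    s3 = xi3 / xi2**2 if xi2 != 0 else float("nan")
    return xi2, xi3, s3


# Toy counts: four empty cells and one cell holding ten objects.
xi2, xi3, s3 = reduced_cumulants([0, 0, 0, 0, 10])
```

The kurtosis follows in the same way from the fourth central moment, with a longer shot-noise correction that is omitted here.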

As for the $P_N$s, we examined them up to N = 5 and $U_\epsilon$ for $\epsilon$ in the range 30-70%, but for N > 2 and
$\epsilon$ > 30% the discriminatory power is virtually absent. Values of $\epsilon$ less than 30 are hardly distinguishable
from $P_0$ over most of the range of scales considered. For these reasons, we will report results only for $P_0$,
$P_1$ and $U_{30}$.

As mentioned in the Introduction, an important point concerning the general significance of our 
analysis is whether the PPS catalogue provides us with a fair sample of the Universe. Although we cannot 
give an answer to this question, we can at least check its reliability against similar data available in the 
literature for other galaxy surveys. 

In Figure 1 we compare the results of our PPS analysis on $\bar{\xi}_2(R)$ and VPF both in the CMB (triangles)
and in the LG (filled circles) frame with that by Vogeley et al. (1994) for a volume-limited subsample of
the CfA2 survey, having the same limiting magnitude as the PPS VLS that we consider (open circles; the
average between northern and southern CfA2 samples is plotted here). In order to be consistent with the
analysis by Vogeley et al., their results must be compared to those of PPS in the LG frame. Therefore,
only for the purpose of this comparison, we resort to the same version of the PPS sample as considered in
paper G. In any case, it turns out that results are very weakly dependent on the frame in which redshifts
are measured. Errorbars for PPS correspond to $3\sigma$ bootstrap uncertainties. It is remarkable how close the
results for the two surveys are over the whole explored scale range, thus indicating that PPS and CfA2 are
essentially equivalent for the statistical analyses we are considering here.

In Figure 2 we plot $\bar{\xi}_2(R)$ for the four simulations considered at the three overdensity thresholds,
$\delta_{\rm th} = 100$, 150, 400 (note that the finite volume of the simulation affects scales $\gtrsim 10\,h^{-1}$ Mpc). The curves
are obtained by averaging over 5 observer locations. The points (filled circles) refer to the PPS sample and
their errorbars are $3\sigma$ from 20 bootstrap resamplings of PPS data. The C+2νDM model clearly provides
a good fit to the observational data, especially for $\delta_{\rm th} = 150$, and an even better fit could be obtained by
setting $\delta_{\rm th} \simeq 180$. Let us recall that this is roughly the density contrast expected for a virialized system
in the approximation of spherical collapse. On the contrary, the artificial galaxy samples that we extract
from the ΛCDM0.3 simulations do not reproduce PPS data. Both the amplitude and the slope of $\bar{\xi}_2$ are
not satisfactory. Moreover, the dependence on $\delta_{\rm th}$ seems weaker here than for C+2νDM. The effect of
sky-variance can be seen in Figure 3, which shows again how hard it is to find an observer setting in
ΛCDM0.3 whose sky has a $\bar{\xi}_2$ consistent with the PPS one. These difficulties of ΛCDM0.3 were however
already known (Klypin, Primack & Holtzman 1996).

The dependence of $S_3$ on R is shown in Figure 4, where, as usual, errorbars are $3\sigma$ for PPS data
and curves refer to simulations. The Figure reveals a fair agreement of models with observational data
and predictions from the hierarchical scaling model (HS; see, e.g., Bonometto et al. 1995, and references
therein), which requires a constant $S_3$. In all cases a satisfactory fit is obtained with $S_3 \simeq 2.2$ (values are
slightly higher for the ΛCDM0.3 simulations than for C+2νDM, but the difference is within the errorbars).
For $R > 6\,h^{-1}$ Mpc the value of $S_3$ decreases, rather abruptly for C+2νDM and ΛCDM$_1$, gently in the
other cases. The significance of this trend is questionable anyway in view of the large uncertainties at these
(relatively) large scales where the number of sampling spheres is small and there may be effects due to the
finite size of the simulation box. Let us also recall that C+2νDM and ΛCDM$_1$ have the same initial random
numbers. Also $S_4$, though rather noisy, is compatible with HS by allowing a fit with a constant value in the
range 6-7. This rather good agreement of "galaxies" in the simulations with redshift survey results and HS
confirms and extends the results of Bonometto et al. (1995), and, in turn, can be taken as an indication that
our galaxy identification procedure is a sensible one.

It should be mentioned that the values of the reduced cumulants, $S_n = \bar{\xi}_n/\bar{\xi}_2^{\,n-1}$, obtained from angular
samples exceed those obtained from redshift surveys by a factor of ~ 3 (Fry & Gaztanaga 1994, Gaztanaga
1994; see also Peebles, 1980). The origin of this discrepancy is still unclear. Since the galaxies included in
angular catalogs span much larger volumes of space than redshift surveys, it could be ascribed to sampling
effects, i.e. that our local neighbourhood is not a fair sample (Gaztanaga 1994), or finite statistics effects
(Colombi, Bouchet & Schaeffer 1994 and 1995; Szapudi & Colombi 1996). Indeed, the last authors point out
that the volumes of current redshift surveys and the number of galaxies they contain appear to be too small
for a meaningful estimate of the n-point functions. On the other hand, in our previous analysis of CDM and
CHDM N-body simulations (Bonometto et al. 1995), we found that the $S_n$ values are decreasing functions
of the halo mass cutoff for galaxy identification. Therefore, since projected samples include fainter galaxies,
which could be less biased tracers of the density field, this can partly account for the discrepancy between
angular and redshift-space analyses. In any case, if observational data and simulations are compared on
strictly similar grounds, as we do, finite statistics should affect results for real and artificial galaxies by the
same amount. However, it is worth stressing here that the limits of our analysis should not be forgotten
especially when comparing our results with data from large angular samples.

Figures 5 and 6 give the results for the void probability function $P_0(R)$ for simulated and PPS data.
Errorbars for observational data are again $3\sigma$ and the solid line represents the expected behaviour for a
Poisson sample with the same number of objects as the real one. In Figure 5 the VPFs for different $\delta_{\rm th}$ are
plotted. In Figure 6, which also shows the effect of the sky variance, the expected contribution to $P_0$ coming
from Poisson noise is subtracted and the resulting difference is divided by the volume $V(R) = (4\pi/3)R^3$.
Plotting $(P_0 - P_{0,\rm Poisson})/V$ magnifies the detailed behaviour of the VPF at small R.
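
As a hedged sketch of the quantity plotted in Figure 6, the Poisson expectation for the VPF of an unclustered sample of number density n is exp(-nV), and the excess per unit volume follows directly. The function names are hypothetical; the density uses the 902 galaxies and the sample volume of $1.5 \times 10^5\,h^{-3}$ Mpc$^3$ quoted in the text.

```python
import math


def sphere_volume(radius):
    """Volume of a sphere of the given radius."""
    return 4.0 * math.pi * radius**3 / 3.0


def poisson_vpf(number_density, radius):
    """Void probability of an unclustered (Poisson) point distribution."""
    return math.exp(-number_density * sphere_volume(radius))


def vpf_excess(p0, number_density, radius):
    """(P0 - P0_Poisson) / V for a measured void probability p0."""
    return (p0 - poisson_vpf(number_density, radius)) / sphere_volume(radius)


# Mean density of the volume-limited sample, in h^3 Mpc^-3.
n_pps = 902 / 1.5e5
p0_poisson_5 = poisson_vpf(n_pps, 5.0)
```

A clustered sample has more large voids than a Poisson one, so the excess is expected to be positive for the real data.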

From Figures 5 and 6, it is clear that the C+2νDM model agrees with PPS data at all scales,
independently of the choice of $\delta_{\rm th}$. In contrast, as before for $\bar{\xi}_2$, it is difficult for ΛCDM0.3 to yield an
observer setting whose sky is even marginally consistent with PPS data, even when the overdensity threshold
is pushed down to its lowest value $\delta_{\rm th} = 100$.

Figure 7 shows the results obtained for C+2νDM from artificial galaxy samples built starting from
halo identification. The 2-point function and VPF are shown for the two galaxy identification methods
described in the previous Section, applied to C+2νDM halos. Note that the effect of halo fragmentation
is rather limited, especially for $P_0(R)$. This is essentially due to the presence of two effects, which act in
opposite directions in determining the strength of the "galaxy" clustering. On the one hand, breaking up 
halos increases the mass threshold. Therefore, galaxies are identified to correspond to higher peaks of the 
DM density field, which in turn leads to an increase of their clustering. On the other hand, since fragments 
generated by the same halo are assigned different peculiar velocities, redshift-space distortions cause a 
suppression of the clustering. The resulting stability of $P_0(R)$ results can be also appreciated by comparing
them with Figure 5. This confirms that VPF results are connected with DM composition or model, while 
the method of galaxy identification, within the class we considered here, which is based on local and positive 
biasing, has only a modest relevance. 

In Figure 8 we report the behaviour of $P_1(R)$ for data and models. Observational bootstrap errors
are fairly wide here, especially at large R. In spite of that, at $R < 4\,h^{-1}$ Mpc, ΛCDM0.3 samples miss
PPS data, while C+2νDM is once more in good agreement with them. Similar considerations hold
for the underdensity probability function $U_\epsilon$, which is illustrated in Figure 9 for a 30% underdensity
threshold. Notice that, because of the point-like nature of the distribution, $U_\epsilon$ carries new information
with respect to $P_0$ only when R approaches the average inter-particle separation, precisely when
$R > [300/(4\pi\epsilon)]^{1/3} d_{\rm gal} = 3.42\,(\epsilon/100)^{-1/3}\,h^{-1}$ Mpc. This is the reason why we plot results on $U_{30}$ only for
$R > 4.5\,h^{-1}$ Mpc.
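
The radius condition above can be read as requiring that $\epsilon$% of the mean count in a sphere exceeds one galaxy, since otherwise a sphere below the threshold is necessarily empty and $U_\epsilon$ coincides with $P_0$. This reading is our interpretation of the formula, and the helper below, with a hypothetical name, simply evaluates its prefactor.

```python
import math


def min_radius_for_new_information(eps_percent, d_gal=5.5):
    """Radius above which eps% of the mean count in a sphere exceeds one object.

    d_gal is the mean inter-galaxy separation (h^-1 Mpc); the mean count in a
    sphere of radius R is (4 pi / 3) (R / d_gal)^3.
    """
    return (300.0 / (4.0 * math.pi * eps_percent)) ** (1.0 / 3.0) * d_gal


# Prefactor of the relation, i.e. the value for eps = 100%.
r_full = min_radius_for_new_information(100.0)
```

Lower thresholds push this radius to larger values, scaling as the inverse cube root of $\epsilon$.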

Finally, Figure 10 illustrates the effects of changing the sampling of galaxies in overdensities
(we take $\delta_{\rm th} = 150$ in all panels). Here we report the results for $P_0$ and $P_1$, whose measure is potentially
most sensitive to the sampling choice. In each panel the dashed curve shows the scatter between different
observer settings in the "usual" realization, i.e. the one used to draw the previous figures. We also plot $1\sigma$
errorbars that we obtain from such results. Superimposed on them, solid lines give the results for two typical
observer settings "observing" a different subset of particles, thus showing the limited effects of changing
realization. In fact, the difference between the averages over 5 observers in two different realizations is
smaller than the difference among observer settings in a single realization by a factor ~ 3-10.

As a general remark, cosmic variance does not appear to play an important role (an
idea of its effect can be obtained by comparing ΛCDM$_1$ and ΛCDM$_2$). In contrast, comparing ΛCDM-80
with its smaller-box companions reveals non-negligible differences. Since effects of finite box size are
expected to play a role on large scales ($\gtrsim 10\,h^{-1}$ Mpc), differences on scales of a few Mpc are unlikely to be directly
related to the size of the simulation box. For instance, Kauffmann & Melott (1993) pointed out that
the scaling of the VPF starts feeling the box limits at about $L/4$. Therefore, any difference on scales
$\sim 2$-$5\,h^{-1}$ Mpc, where we are mostly able to discriminate between models, seems more likely to be an effect
of different resolutions: in ΛCDM-80 the linear cell size is a factor of 1.6 larger and the mass of each DM
particle is increased by a factor of $1.6^3 \simeq 4.1$.
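The resolution factors quoted above follow from simple scaling. The sketch below assumes, as the quoted factors imply but the text does not restate here, that ΛCDM-80 uses the same mesh size $N_{\rm mesh}$ and particle number $N_p$ as the $50\,h^{-1}$ Mpc runs, at the same mean density $\bar\rho$; these symbols are introduced only for illustration.

```latex
\frac{\Delta x_{80}}{\Delta x_{50}}
  = \frac{80/N_{\rm mesh}}{50/N_{\rm mesh}} = 1.6,
\qquad
\frac{m_{80}}{m_{50}}
  = \frac{\bar\rho\,80^3/N_p}{\bar\rho\,50^3/N_p} = 1.6^3 \simeq 4.1 .
```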

4. Conclusions 

In this work we tested the statistical properties of artificial galaxy samples extracted from
high-resolution simulations of C+2νDM with $\Omega = 1$, $\Omega_\nu = 0.2$, $h = 0.5$ and ΛCDM with $\Omega_0 = 0.3$, $\Omega_\Lambda = 0.7$,
$h = 0.7$ (ΛCDM0.3) against a similar volume-limited sample of the PPS Survey. Artificial galaxies reside
in overdensity regions of the evolved density field whose density contrasts are above a suitable threshold
$\delta_{th}$. We showed that, while the reduced skewness $S_3$ yields almost identical results for the two models
(and so does the kurtosis $S_4$, but with larger uncertainties), the variance, $P_0$ and $P_1$ are able to discriminate
efficiently between them (as is $U_\epsilon$, though only marginally). In particular, the C+2νDM model agrees with our
observational data, while it is quite difficult to find an observer setting from which ΛCDM0.3 is consistent
with PPS data. The latter results confirm the analysis of Klypin, Primack & Holtzman (1995) and show
that the excessive small-scale clustering of ΛCDM0.3 is apparent in redshift space as well and makes this
model hardly viable, at least as long as galaxies follow DM overdensities. In contrast, the analysis of $S_3$ (and
$S_4$) does not distinguish between the models, as noted above, and agrees with PPS data and HS predictions
with constant values of $\sim 2.2$ (and 6-7). This extends the results of Bonometto et al. (1995), who also
found a good agreement of "galaxies" with HS in CDM and CHDM N-body simulations, at variance with
DM particles.
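As an illustration of how the reduced skewness and kurtosis can be obtained from galaxy counts, the following sketch uses factorial moments of counts in equal-volume cells to remove Poisson shot noise. This is one common route and is not necessarily the estimator used in this paper; the function name and its input format are hypothetical.

```python
import numpy as np

def hierarchical_coefficients(counts):
    """Return (xi2, S3, S4) from an array of galaxy counts in equal-volume cells."""
    n = np.asarray(counts, dtype=float)
    nbar = n.mean()
    # Factorial moments remove Poisson shot noise: F_k / nbar**k = <(1 + delta)**k>.
    m2 = np.mean(n * (n - 1)) / nbar**2
    m3 = np.mean(n * (n - 1) * (n - 2)) / nbar**3
    m4 = np.mean(n * (n - 1) * (n - 2) * (n - 3)) / nbar**4
    # Central moments of the underlying density contrast delta (with <delta> = 0).
    d2 = m2 - 1.0
    d3 = m3 - 3.0 * m2 + 2.0
    d4 = m4 - 4.0 * m3 + 6.0 * m2 - 3.0
    # Connected (volume-averaged) correlation functions.
    xi2, xi3, xi4 = d2, d3, d4 - 3.0 * d2**2
    # Hierarchical ratios: S3 = xi3 / xi2^2, S4 = xi4 / xi2^3.
    return xi2, xi3 / xi2**2, xi4 / xi2**3
```

Under hierarchical scaling, the two ratios returned by this function stay roughly constant as the cell size changes, which is the behaviour reported here for both models and for PPS.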

The values of $S_3$ and $S_4$ that we found agree with those derived from other redshift surveys, which,
as is known, are markedly smaller than those obtained from angular samples (see, e.g., Fry & Gaztanaga
1994). As already stressed, addressing the origin of this discrepancy is beyond the scope of this
paper. However, we note here that the remarkable stability of our results seems to indicate
that sampling effects do not play an important role. If redshift distortions, projection effects and the
mixing of galaxies of largely different luminosities do not contribute either (see, e.g., Fry & Gaztanaga
1994 and Gaztanaga 1994, who however used the shallow CfA1 sample), the reason could very likely be
finite statistics effects, which indeed tend to decrease the estimates of the hierarchical coefficients (Colombi,
Bouchet & Schaeffer 1994; see also Szapudi & Colombi 1996). This should not be a cause of concern for our
analysis, since we compare observational data and simulations through "galaxy" samples of equal geometry,
volume and inter-particle separation. Even smaller effects are expected for the VPF, which has been found
to be less sensitive to finite-volume effects (Colombi, Bouchet & Schaeffer 1995).

Our analysis shows that ΛCDM0.3 tends to overproduce low-density regions. This is shown both by $P_0$
and $U_{30}$. Also, the probability of finding a single galaxy in spheres of radius smaller than $\sim 3$-$4\,h^{-1}$ Mpc is smaller
than in the PPS data. These inconsistencies are more or less pronounced in the various realizations, and depend
on the threshold selected, but are present everywhere. It seems clear that the galaxy number distribution
in random spheres is significantly different in ΛCDM0.3 and in the real world. However, let us add a word
of caution about our conclusions in view of the limitations of our analysis, especially those related to the
uncertainties on how galaxies actually form and on the way their real distribution relates to that of DM
particles.

As a concluding remark, it is worth pointing out what we have learned here about the ultimate goal of
picking the "final" cosmological model. As for the ΛCDM models, the one we considered here appears
to have serious trouble in reproducing the galaxy clustering below $10\,h^{-1}$ Mpc. It is however clear that, by
suitably changing the model parameters, one may get substantial improvements (we reserve for a forthcoming
paper the study of larger simulations of a larger suite of models). As for the class of CHDM models, while
the model with 1 massive neutrino providing $\Omega_\nu = 0.3$ fails to pass the VPF test (Ghigna et al. 1994),
C+2νDM with $\Omega_\nu = 0.2$ is in good agreement with all data considered here. Therefore, having one single
massive neutrino flavour with $m_\nu = 7$ eV instead of two massive neutrino flavours with $m_\nu = 2.3$ eV seems
completely sufficient to alter the void distribution in a detectable way. The remarkable performance of
the C+2νDM model in this small-scale redshift-space analysis adds to previous favorable results from
numerical and linear theory calculations (see Primack et al. 1995, Primack 1996). This makes it a good
candidate to interpret the large-scale structure of the Universe.
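For orientation, the neutrino masses quoted above are consistent with the standard approximate relation between the neutrino density parameter and the summed masses of the massive species. The coefficient of about 93 eV, the symbol $N_\nu$ (number of equal-mass massive species) and the use of $h = 0.5$ for the one-neutrino model are assumptions of this sketch, not values taken from the text.

```latex
\Omega_\nu h^2 \simeq \frac{\sum m_\nu}{93\ {\rm eV}}
\quad\Longrightarrow\quad
m_\nu \simeq \frac{93\ {\rm eV}\;\Omega_\nu h^2}{N_\nu},
\\[4pt]
\Omega_\nu = 0.3,\ h = 0.5,\ N_\nu = 1:\quad
m_\nu \simeq 93 \times 0.3 \times 0.25\ {\rm eV} \simeq 7.0\ {\rm eV},
\\[4pt]
\Omega_\nu = 0.2,\ h = 0.5,\ N_\nu = 2:\quad
m_\nu \simeq \tfrac{1}{2} \times 93 \times 0.2 \times 0.25\ {\rm eV} \simeq 2.3\ {\rm eV}.
```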
Fig. 1. — Comparison between results from the CfA2 survey (open circles; after Vogeley et al. 1994) and PPS
both in the LG (filled circles) and CMB (triangles) frames. Left and right panels are for the variance $\bar\xi_2(R)$ and
the VPF $P_0(R)$, respectively. For reasons of clarity, errorbars are reported only for PPS in the LG frame and
correspond to $3\sigma$ bootstrap uncertainties.

Fig. 2. — Variance $\bar\xi_2$ vs. scale $R$ for the set of $800^3$-mesh simulations (continuous curves, which correspond
to three values of the overdensity threshold $\delta_{th}$ as shown in the first panel) and for the PPS sample (filled
circles; errorbars are $3\sigma$ bootstrap errors). For each simulation and each $\delta_{th}$, the curves are averages over
5 artificial samples differing in the observer location. Of the three ΛCDM0.3 simulations, ΛCDM$_1$ has the
same initial random numbers as C+2νDM and a $50\,h^{-1}$ Mpc box, ΛCDM$_2$ has the same box size but an
independent set of random numbers, while ΛCDM80 has an $80\,h^{-1}$ Mpc box.

Fig. 3. — Effect of sky variance on $\bar\xi_2$. For each simulation, the curves correspond to 5 different observer
settings. PPS data are also shown as a reference.

Fig. 4. — Reduced skewness $S_3 = \bar\xi_3/\bar\xi_2^2$ as a function of $R$ for simulated samples and PPS. Symbols are as
in Figure 2.

Fig. 5. — Void probability function $P_0$ vs. $R$ for simulated samples and PPS. Symbols are the same as in
Figure 2. In each panel, the solid curve is what is expected for a Poissonian distribution of points with
average separation $d_{gal}$.

Fig. 6. — Here the VPF is plotted after subtracting the Poissonian $P_0$ and then dividing by the volume $V(R)$
of a sphere of radius $R$, which allows us to magnify the small-scale behaviour. Each plot also shows $P_0(R)$ for
five typical different settings in the simulations (dotted lines), which gives an indication of the sky variance. We
have chosen the $\delta_{th}$'s for which the $P_0$'s of the models best approach the observational curve. In the top-left
panel, the heavy "T" at the bottom sets the boundary of the region where the signal is indistinguishable
from Poissonian. This boundary is obtained from the $3\sigma$ scatter among measures for 50 different realizations of the
Poissonian distribution in the same volume as our samples.

Fig. 7. — $\bar\xi_2(R)$ and $P_0(R)$ from artificial samples based on halo identification. Dashed and dotted lines refer
to no fragmentation and full fragmentation of the DM halos, respectively, and are obtained as averages over
10 realizations of artificial samples. For the sake of comparison, the result for PPS (filled circles) is also given.

Fig. 8. — Results for $P_1(R)$, the probability of counting one galaxy in a sphere of radius $R$. Symbols as in
Figure 2.

Fig. 9. — Results for $U_{30}$, the probability that the number density $n$ in a sphere of radius $R$ is less than 30%
of the average $\bar n$.

Fig. 10. — Effects of changing galaxy sampling in overdensities, for $P_0$ and $P_1$. Dashed curves are results
for different settings in the realization used for the previous figures; $1\sigma$ errorbars are worked out from their
variance. Continuous lines give results for two observer settings within a different realization. Results for
two models with $\delta_{th} = 150$ only are plotted.